We consider the K-satisfiability problem on a regular d-ary rooted tree. For this model,
we show how to calculate in closed form the moments of the total number
of solutions as a function of d and K, where the average is over all realizations, for a
fixed assignment of the surface variables. We find that different moments pick out different
'critical' values of d, below which they diverge as the total number of variables on the tree
$\to \infty$ and above which they decay. We show that K-SAT on the random graph behaves
similarly. We also calculate exactly the fraction of instances that have solutions, for all K.
On the tree, this quantity decays to 0 (as the number of variables increases) for any d > 1.
However, the recursion relations for this quantity have a non-trivial fixed-point solution, which
indicates the existence of a different transition in the interior of an infinite rooted tree.

I. INTRODUCTION 

Constraint satisfaction problems (CSP) are problems in which a set of variables, which can 
take values in a specified domain, have to satisfy a number of constraints. Each constraint usu- 
ally restricts the values that a subset of the variables can take. The problem then is to find an 
assignment of the variables that satisfies all the constraints. The K-satisfiability problem (K-SAT)
is an important example of a CSP. In this problem, the variables are Boolean, taking
values True or False, and each constraint is in the form of a clause, which restricts the values of
K variables at a time, disallowing 1 out of the $2^K$ possible values that these K variables can take
together. 

Satisfiability has been a fundamental problem in computer science for almost forty years. It is
known that as soon as there are clauses which restrict the values of $\geq 3$ variables, this problem
is NP-complete [1], i.e., potential solutions can be verified easily for correctness, but finding a
solution can take exponential time in the worst case. In addition, because SAT is NP-complete, a
polynomial-time algorithm for solving SAT could be adapted to solve all problems in NP in
polynomial time.

The version of K-SAT that we are interested in here is random K-SAT, which
has been investigated extensively in the past few years. In random K-SAT one looks at the
ensemble of randomly generated logical expressions, where each logical expression or formula is an
AND of M clauses. Each clause consists of an OR of K Boolean literals (a literal being one of
the N variables or its negation), chosen randomly from a set of N Boolean variables. As the ratio
$\alpha = M/N$ increases, it becomes harder to find assignments of the N variables that
satisfy the logical expression of M clauses. One question of interest is hence whether there
exists an $\alpha_c$ beyond which, in the limit $M \to \infty$ and $N \to \infty$, no satisfying assignments exist.

Numerical experiments have shown that the probability that a randomly chosen formula with
$M = \alpha N$ clauses is satisfiable approaches 1 for $\alpha < \alpha_c(K)$ and vanishes for
$\alpha > \alpha_c(K)$ when $N \to \infty$. The existence of a sharp transition
(the solvability transition) is intrinsic to the problem and not an artifact of any particular
algorithm (the existence of a possibly N-dependent transition has been proved for any K), but its
location has not been determined rigorously, with the exception of K = 2 [5]. There are however
several rigorous bounds on $\alpha_c(K)$, both upper and lower. In addition,
powerful methods from statistical physics, taking advantage of the connection of this problem to 
the theory of mean field spin glasses, have been used to conjecture values for the threshold that 
seem to be very accurate numerically [3, Q]. 

The above problems are all originally defined on random graphs, where the presence of large 
loops makes the problem hard to solve exactly. Hence solving the problem on trees or locally tree- 
like graphs has played an important role in elucidating the nature of the solvability transition as 
well as other phase transitions present in the problem. In fact, the methods from statistical physics
assume the absence of correlations between some random variables, which is equivalent to solving
the K-SAT problem on a locally tree-like graph. In addition, it has also been shown that certain
problems on a tree, in particular the tree-reconstruction problem, become equivalent to the spin
glass problem on a random graph.

In this paper we study the K-SAT problem on a regular d-ary rooted tree in which every vertex
(except the leaves) has exactly d descendants. The values that the surface nodes (or leaves) of
the tree take are fixed. For a given assignment of the surface nodes, and a given realization of clauses
on the tree, one can ask how many assignments of the variables on the tree are solutions. For such
a tree graph, we can calculate exactly the moments of this quantity averaged over all realizations.
The behaviour of the moments is similar to that on the random graph, and we find that there is a
different "critical" point associated with each moment, as in the random energy model considered
by Derrida. We also calculate exactly the fraction of realizations that have solutions, for a
tree of any size. This property too shows a similarity to the behaviour of the model on a random
graph, since there is a value of d above which the fraction of realizations that have solutions
$\to 0$ as $N \to \infty$. We also look at this quantity in the interior of an infinite tree, and
show that a different transition takes place in this limit.

FIG. 1: 3-SAT on a rooted tree of depth 2 and d = 2. Only the clauses neighboring the root are labelled.
The surface variables are depicted by a •. Dashed/full lines between a variable and a clause indicate that it
is negated/un-negated.

The plan of the paper is as follows: in Section II we introduce the model. In Sections III, IV
and V we calculate the moments of the number of assignments that are solutions, averaged over
all realizations for a randomly fixed boundary, for 2-SAT, 3-SAT and arbitrary K respectively.
In Section VI we calculate the probability that a given realization has at least one solution (or
equivalently the fraction of realizations that have solutions), as a function of the depth of the tree.
In Section VII we carry out the fixed-point analysis relevant for the interior of an infinite tree. We
end with a summary and discussion in Section VIII.



II. THE MODEL 



We define the K-SAT problem on a tree as follows. Consider a regular d-ary tree T in which
every vertex has exactly d descendants. The root of the tree $x_0$ has degree d and its d edges are
connected to function nodes $\{c_1, c_2, \ldots, c_d\}$. Each function node has degree K, and each of its
$K - 1$ descendants $\{x_i\} = \{x_1, x_2, \ldots, x_{K-1}\}$ is the root of an independent tree (see Fig. 1).
Hence the root has degree d, while all the other vertices on the tree (except the leaves, which have
degree 1) have degree d + 1. Each function node is associated independently with a clause
$\phi(x_0, x_1, \ldots, x_{K-1})$, where the vertices $x_0, x_1, \ldots$ are the neighboring vertices of the
function node, joined to the function node by a dashed or solid edge indicating whether the
corresponding vertex is negated or not in the clause. We consider only the case where the vertices
can take one of two values, 0 or 1, and where every function node is given by
$\phi = l_0 \vee \ldots \vee l_{K-1}$. Here $l_i$ is one of the two literals $\bar{x}_i$ or $x_i$, depending on
whether $x_i$ is joined to the function node by a dashed or a solid line (Fig. 1).

An assignment $\sigma$ of all the variables on the tree (barring the surface variables, which take fixed
values, see below) is a solution iff $\phi = 1$ for all the clauses on the tree. One configuration of dashed
and solid lines on the tree defines a realization R.

For a random K-SAT problem, i.e. a K-SAT problem on a random graph, the variable $\alpha = M/N$,
where M is the total number of clauses and N is the total number of variables (vertices), is a
meaningful quantity. As a function of $\alpha$ one can ask, for example, how the moments of the total
number of solutions (averaged over all realizations) scale with N.

To ask the same question meaningfully here on a tree, it is usual to fix the values of the variables
on the surface of the tree. If we consider the surface variables to have depth 0, then we can denote
this condition by $\sigma(0) = L$, where $\sigma(0)$ is the assignment of the variables at the 0th depth and L
signifies a particular assignment for the variables at this depth. Variables removed from the surface
by one function node (or one level) are at depth 1, and so on. The tree is said to have depth n if
the root is n levels away from the surface.

If the surface variables are fixed, then it is easy to check that $M(n) = d\,N(n)$, where $M(n)$ and
$N(n)$ are, respectively, the total number of clauses and the total number of
vertices for a tree of depth n (d = 1 if the surface variables are left free). So $\alpha$ is the
equivalent of d on a tree with fixed boundary conditions.

Let us denote the total number of solutions (a sum over all $\sigma$ which are solutions) for a particular
realization of a tree of depth n and a specific boundary condition L as $Z_R(L, n)$. This is a stochastic
variable which varies from realization to realization as well as from one boundary condition to
another. The first moment of this quantity, averaged over all realizations (for a fixed boundary
condition), is trivially computed and is equal to
$\left( 2 \left( \frac{2^K - 1}{2^K} \right)^{d} \right)^{N(n)}$ (we derive this later in Section III).

Note that, from simple considerations, it is easy to see that even random K-SAT has the same
expression for the first moment as on the tree, with d replaced by $\alpha$. This expression for the first
moment gives an annealed approximation for $\alpha_c(K)$.
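
The annealed estimate can be made concrete with a short numerical sketch. It assumes only the first-moment expression quoted above, so that the first moment stops growing where $2(1 - 2^{-K})^d = 1$. The function names `annealed_threshold` and `first_moment` are illustrative and not part of the original text.

```python
import math


def annealed_threshold(K: int) -> float:
    """Value of d (or alpha) at which 2 * (1 - 2**-K)**d equals 1.

    Below this value the first moment grows with the number of variables,
    above it the first moment decays.
    """
    return math.log(2.0) / -math.log(1.0 - 2.0 ** (-K))


def first_moment(K: int, d: float, n_vars: int) -> float:
    """First moment (2 * (1 - 2**-K)**d) ** n_vars of the number of solutions."""
    return (2.0 * (1.0 - 2.0 ** (-K)) ** d) ** n_vars


if __name__ == "__main__":
    for K in range(2, 9):
        print(K, round(annealed_threshold(K), 3))
```

The values printed for K = 2, ..., 8 can be compared with the first-moment column of Table I.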

We are henceforth interested in also estimating the higher moments of $Z_R(L, n)$. Before writing
down the recursions for this quantity on the tree, however, we introduce a little more notation. By
$C^s(x_i)$ we denote all the clauses which are neighbours of variable $x_i$ and are satisfied by it. Similarly,
by $C^u(x_i)$ we denote all the clauses which are neighbours of variable $x_i$ but are not satisfied by
it. Let $F_R(L, n)$ and $G_R(L, n)$ denote the number of solutions for a tree of depth n (for a given
realization and boundary condition L) in which the root takes the value 0 and 1 respectively.

FIG. 2: Schematic diagram of the tree for 2-SAT. Variable $x_0$, the root of a tree of depth n, is connected
through clauses $c_1$, $c_2$ and $c_3$ to variables $x_1$, $x_2$ and $x_3$ respectively, each of which is in turn the
root of a tree of depth n − 1.

III. RECURSION RELATIONS ON THE TREE FOR 2-SAT 

We can write exact recursion equations on the tree for the stochastic variable $Z_R(L, n)$. We
first look at the form of these recursions for 2-SAT. For ease of notation we henceforth omit the L
in the argument. A rooted tree of depth n is generated by taking a root of degree d, then picking,
for each edge connecting the root to a function node, independently the type of edge (dashed or
solid), and finally attaching trees of depth n − 1 to the other end of each function node (again via
edges that are independently chosen to be dashed or solid); see Fig. 2.

Let $d_1$ be the cardinality of the set $C^u(x_0)$ when vertex $x_0$ takes value 0. Then the vertices
along the other edge of these clauses can only take on the specific value satisfying the clause, for
any realization of the edge connecting them to the clause. For those clauses satisfied by $x_0$ ($d - d_1$
of them; $d - d_1$ is just the cardinality of the set $C^s(x_0)$), the vertex at the other edge of the clause
is free to take any value. So the number of solutions for the tree of depth n, with root node taking
value $x_0 = 0$, for this specific realization (and boundary condition), is the product of the number
of solutions of d sub-trees of depth n − 1.
The recursion relation for $F_R(n)$ is hence:

$$F_R(n) = \prod_{i=1}^{d-d_1} Z_{R_i}(n-1) \prod_{i=d-d_1+1}^{d} \left( F_{R_i}(n-1)\,\eta_i + G_{R_i}(n-1)\,(1-\eta_i) \right) \tag{1}$$

where $R_i$ denotes the realization of each of the sub-trees rooted in the descendants of $x_0$ and $\eta_i$
is a stochastic variable which is equally likely to take the value 0 or 1, depending on whether the
edge joining the variable at depth n − 1 to its clause is dashed or solid. Similarly,

$$G_R(n) = \prod_{i=d-d_1+1}^{d} Z_{R_i}(n-1) \prod_{i=1}^{d-d_1} \left( F_{R_i}(n-1)\,\eta_i + G_{R_i}(n-1)\,(1-\eta_i) \right) \tag{2}$$

and,

$$Z_R(n) = G_R(n) + F_R(n) \tag{3}$$
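
The recursions in Eqs. (1)-(3) lend themselves to direct sampling. The sketch below is illustrative only: it draws random 2-SAT realizations on the tree, with the leaf values also drawn at random (an assumption, corresponding to a randomly fixed boundary), evaluates $F_R$ and $G_R$ recursively, and compares the sampled mean of $Z_R$ with the closed form for the first moment quoted in Section II. The names `sample_FG`, `mean_Z` and `closed_form_mean_Z` are hypothetical.

```python
import random


def sample_FG(n, d, rng):
    """Sample (F, G) for one random 2-SAT realization on a tree of depth n.

    F (G) counts solutions in which the root takes the value 0 (1).
    """
    if n == 0:
        # A surface variable has a fixed value, drawn here uniformly at random.
        return (1, 0) if rng.random() < 0.5 else (0, 1)
    F, G = 1, 1
    for _ in range(d):
        f, g = sample_FG(n - 1, d, rng)
        eta = rng.randint(0, 1)                  # edge type on the child's side
        constrained = f * eta + g * (1 - eta)    # child forced to satisfy the clause
        free = f + g                             # child unconstrained by this clause
        if rng.random() < 0.5:
            # Clause satisfied by root = 0, so it constrains the child when root = 1.
            F *= free
            G *= constrained
        else:
            F *= constrained
            G *= free
    return F, G


def mean_Z(n, d, samples, seed=0):
    """Monte Carlo estimate of the average number of solutions."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        F, G = sample_FG(n, d, rng)
        total += F + G
    return total / samples


def closed_form_mean_Z(n, d):
    """(2 * (3/4)**d) ** N(n), with N(n) the number of non-surface variables."""
    n_vars = sum(d ** k for k in range(n))
    return (2.0 * 0.75 ** d) ** n_vars


if __name__ == "__main__":
    print(mean_Z(2, 2, 20000, seed=1), closed_form_mean_Z(2, 2))
```

The agreement is only statistical, so the two printed numbers should be close rather than identical.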

We can define $\beta_R(n) = F_R(n)/Z_R(n)$, the fraction of solutions in which the root $x_0$ takes the value 0,
for a given realization R and a fixed but arbitrary boundary condition L. Note then that if, for
a given R, we selectively sum over those boundary conditions which result in a certain $\beta_R(n)$, this
gives the recursion for the residual distribution at the root derived earlier in the context of tree
reconstruction.

Our interest here, however, is to calculate the moments of $Z_R(n)$ from the above recursion
relations by averaging over all realizations, for an arbitrary boundary condition L. Hence by
$\langle F_R(n) \rangle$ we will denote an average over all realizations R for a tree of depth n. It is easy to see
that such an average is achieved at the root by averaging over all possible values of $d_1$ for a given d,
where the probability that there are $d_1$ clauses unsatisfied by the root is given by the binomial
distribution $P(d_1) = \frac{1}{2^d} \binom{d}{d_1}$. Also note that since different branches of the tree are
independent of each other, we have

$$\langle F_{R_i}(n) F_{R_j}(n) \rangle = \langle F_{R_i}(n) \rangle \langle F_{R_j}(n) \rangle = \langle F_R(n) \rangle^2 \tag{4}$$

where i and j are variables along different branches. In addition,

$$\langle F_{R_i}(n) \rangle = \langle G_{R_i}(n) \rangle \tag{5}$$

by symmetry. In fact this is true for any moment of F (or G). It is also easy to see that

$$\langle \eta_i \rangle = 0.5 \tag{6}$$

$$\langle \eta_i (1 - \eta_i) \rangle = 0 \tag{7}$$

and

$$\langle \eta_i \eta_j \rangle = \langle \eta_i \rangle \langle \eta_j \rangle = 0.25 \tag{8}$$

Using the above equations, we can now solve Eq. (1) to obtain the different moments.

A. First moment or average number of solutions 

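The average number of solutions, as discussed earlier, can be obtained by just looking at the
probability of satisfying individual clauses. In this section we average Eq. (1) and make use of Eq.
(5) to obtain (after removing subscripts for ease of notation):

$$\langle Z(n) \rangle = 2 \langle F(n) \rangle \tag{9}$$

and

$$\langle F(n) \rangle = \frac{1}{2^d} \left( \langle Z(n-1) \rangle + \langle F(n-1) \rangle \right)^d \tag{10}$$

Using Eq. (9), this becomes

$$\langle F(n) \rangle = \left( \frac{3}{4} \langle Z(n-1) \rangle \right)^d \tag{11}$$

which implies

$$\langle Z(n) \rangle = 2 \left( \frac{3}{4} \right)^d \langle Z(n-1) \rangle^d \tag{12}$$

This is a simple recursion relation which we can solve easily for any boundary condition. The
recursion relation at the boundary is

$$\langle Z(1) \rangle = 2 \left( \frac{3}{4} \right)^d \tag{13}$$

and,

$$\langle F(1) \rangle = \left( \frac{3}{4} \right)^d \tag{14}$$

Eq. (13) follows from Eq. (12) by noting that $\langle Z(0) \rangle = 1$. Using the boundary condition Eq. (13) we
can solve the recursion relation Eq. (12) to get the same result for the average number of solutions
as mentioned in Section II.

B. Second moment

To get the second moment, note that

$$\langle Z^2(n) \rangle = 2 \langle F^2(n) \rangle + 2 \langle F(n) G(n) \rangle \tag{15}$$

From the recursion Eq. (1) for the stochastic variable $F_R(n)$, we have

$$F_R^2(n) = \prod_{i=1}^{d-d_1} Z_{R_i}^2(n-1) \prod_{i=d-d_1+1}^{d} \left( F_{R_i}(n-1)\,\eta_i + G_{R_i}(n-1)\,(1-\eta_i) \right)^2 \tag{16}$$

This gives (again getting rid of subscripts),

$$\langle F^2(n) \rangle = \frac{1}{2^d} \left[ \langle Z^2(n-1) \rangle + \langle F^2(n-1) \rangle \right]^d \tag{17}$$

$$= \frac{1}{2^d} \left[ \langle 2F^2(n-1) + 2F(n-1)G(n-1) \rangle + \langle F^2(n-1) \rangle \right]^d \tag{18}$$

$$= \left( (3/2) \langle F^2(n-1) \rangle + \langle F(n-1)G(n-1) \rangle \right)^d \tag{19}$$

Similarly the equation for $\langle F(n)G(n) \rangle$ is:

$$\langle F(n)G(n) \rangle = \left[ \langle F^2(n-1) \rangle + \langle F(n-1)G(n-1) \rangle \right]^d \tag{20}$$

These two coupled equations need to be solved to get the second moment. The boundary conditions
are,

$$\langle F^2(1) \rangle = \left( \frac{3}{4} \right)^d \tag{21}$$

which we get by noting that $\langle F^2(0) \rangle = 1/2$ and $\langle F(0)G(0) \rangle = 0$. Similarly,

$$\langle F(1)G(1) \rangle = \left( \frac{1}{2} \right)^d \tag{22}$$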

To solve the coupled recursion equations, let us define the ratio

$$r_n = \frac{\langle F_R(n) G_R(n) \rangle}{\langle F_R^2(n) \rangle} \tag{23}$$

Then we have the following equation for $r_n$:

$$r_n = \frac{[1 + r_{n-1}]^d}{[3/2 + r_{n-1}]^d} \tag{24}$$

with boundary condition $r_1 = (2/3)^d$.

For large n, $r_n$ reaches a fixed point, and in that limit we can solve for the fixed point of this
equation to get $r^* = 0.78\ (d = 1),\ 0.576\ (d = 2),\ 0.4\ (d = 3)$. If we now approximate the equation for
$\langle F^2(n) \rangle$ by

$$\langle F^2(n) \rangle \approx \langle F^2(n-1) \rangle^d \, (3/2 + r^*(d))^d \tag{25}$$

we can solve this, starting from the boundary value $\langle F^2(1) \rangle = (3/4)^d$, to get (for d > 1), up to an
n-independent prefactor,

$$\langle F^2(n) \rangle \sim \left[ \left( \frac{3}{4} \right)^{d} (3/2 + r^*(d))^{\frac{d}{d-1}} \right]^{d^{\,n-1}} \tag{26}$$

If the term in the brackets is < 1, the second moment decreases with system size. If it is > 1, then
the second moment increases with system size. The "critical" value lies between d = 3.1 and 3.2.
To get the value of this threshold more precisely, we need to solve the equation

$$\langle F^2(n) \rangle = \langle F^2(n-1) \rangle^d \, (3/2 + r_{n-1}(d))^d \tag{27}$$

exactly. Solving it numerically we get the critical value of d to lie between 3.06 and 3.07.
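
As a rough numerical check of the fixed points quoted above, the ratio recursion for 2-SAT can simply be iterated. The sketch below assumes the recursion $r_n = [(1 + r_{n-1})/(3/2 + r_{n-1})]^d$ with $r_1 = (2/3)^d$. The bracketed base is taken to be $(3/4)^d (3/2 + r^*)^{d/(d-1)}$, which is the form obtained when the approximate recursion is started from $\langle F^2(1) \rangle = (3/4)^d$; this form is an assumption of the sketch. The names `r_fixed_point` and `approx_base` are illustrative.

```python
def r_fixed_point(d, iterations=500):
    """Iterate r -> ((1 + r) / (3/2 + r))**d starting from r_1 = (2/3)**d."""
    r = (2.0 / 3.0) ** d
    for _ in range(iterations):
        r = ((1.0 + r) / (1.5 + r)) ** d
    return r


def approx_base(d):
    """Approximate per-level growth base of <F^2(n)> using r* for all levels (d > 1).

    Values above 1 indicate a second moment that grows with system size,
    values below 1 a second moment that decays.
    """
    r_star = r_fixed_point(d)
    return 0.75 ** d * (1.5 + r_star) ** (d / (d - 1.0))


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, round(r_fixed_point(d), 3))
    for d in (3.0, 3.1, 3.2, 3.3):
        print(d, approx_base(d))
```

Under these assumptions the base crosses 1 between d = 3.1 and d = 3.2, in line with the approximate window quoted in the text.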

We can follow this procedure for any moment, though the number of coupled recursions that
have to be solved simultaneously increases with the order of the moment.



IV. RECURSION RELATIONS FOR K = 3 



The recursion relations for higher K, though slightly more complicated, follow the same logic as
for K = 2. We carry out the computation for K = 3 here. Now the recursion relation for $F_R(n)$ is:

$$F_R(n) = \prod_{i=1}^{d-d_1} Z_{R_{i1}} Z_{R_{i2}} \prod_{i=d-d_1+1}^{d} \left[ Z_{R_{i1}} Z_{R_{i2}} - \left( F_{R_{i1}} \eta_{i1} + G_{R_{i1}} (1 - \eta_{i1}) \right) \left( F_{R_{i2}} \eta_{i2} + G_{R_{i2}} (1 - \eta_{i2}) \right) \right] \tag{28}$$

where we have suppressed the dependence on n − 1 on the right-hand side for ease of presentation.
We can also write a similar equation for G. The $\eta$'s are the same as those appearing earlier. The
indices i1 and i2 signify the two variables at the end of the same clause (see Fig. 1). Since they belong
to two different branches, their averages still decouple. The second term is the counterpart of the term
appearing in 2-SAT along the branches where the root was not satisfying the clause (or link in 2-SAT).
In the case of 3-SAT, if the root does not satisfy one of its clauses, the other two variables are
collectively constrained to not take 1 out of the 4 total assignments they could have had. Which specific
assignment is forbidden depends on the realization.

A. First moment or Average number of Solutions 

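Again removing subscripts,

$$\langle F(n) \rangle = \frac{1}{2^d} \left( \langle Z(n-1) \rangle^2 + \langle Z(n-1) \rangle^2 - \langle F(n-1) \rangle^2 \right)^d \tag{29}$$

$$= \frac{1}{2^d} \left( 2 \langle Z(n-1) \rangle^2 - \langle F(n-1) \rangle^2 \right)^d \tag{30}$$

$$= \left( \frac{7}{8} \langle Z(n-1) \rangle^2 \right)^d \tag{31}$$

implying

$$\langle Z(n) \rangle = 2 \left( \frac{7}{8} \langle Z(n-1) \rangle^2 \right)^d \tag{32}$$

As before, we need to solve the recursion relations keeping the boundary conditions in mind. The
recursion relation at the boundary is

$$\langle Z(1) \rangle = 2 \left( \frac{7}{8} \right)^d \tag{33}$$

and,

$$\langle F(1) \rangle = \left( \frac{7}{8} \right)^d \tag{34}$$

We can now solve Eq. (32) to get the result for the annealed average already mentioned in Section II.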

B. Second moment 

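Squaring Eq. (28) and taking averages, we get,

$$\langle F^2(n) \rangle = \left\langle \prod_{i=1}^{d-d_1} Z_{i1}^2(n-1) Z_{i2}^2(n-1) \prod_{i=d-d_1+1}^{d} \left( Z_{i1} Z_{i2} - \left( F(n-1) \eta_{i1} + G(n-1)(1 - \eta_{i1}) \right) \left( F(n-1) \eta_{i2} + G(n-1)(1 - \eta_{i2}) \right) \right)^2 \right\rangle \tag{35}$$

Simplifying, we get,

$$\langle F^2(n) \rangle = \frac{1}{2^d} \left( \langle Z^2(n-1) \rangle^2 + \langle Z^2(n-1) \rangle^2 + \langle F^2(n-1) \rangle^2 - 2 \langle F(n-1) Z(n-1) \rangle^2 \right)^d \tag{36}$$

$$= \frac{1}{2^d} \left( 7 \langle F^2(n-1) \rangle^2 + 6 \langle F(n-1)G(n-1) \rangle^2 + 12 \langle F(n-1)G(n-1) \rangle \langle F^2(n-1) \rangle \right)^d \tag{37}$$

Similarly the equation for $\langle F(n)G(n) \rangle$ is

$$\langle F(n)G(n) \rangle = \left( 3 \langle F^2(n-1) \rangle^2 + 3 \langle F(n-1)G(n-1) \rangle^2 + 6 \langle F(n-1)G(n-1) \rangle \langle F^2(n-1) \rangle \right)^d \tag{38}$$

Again defining the ratio of $\langle F(n)G(n) \rangle$ to $\langle F^2(n) \rangle$ as $r_n$, we get

$$r_{n+1} = \left[ \frac{3 + 3 r_n^2 + 6 r_n}{7/2 + 3 r_n^2 + 6 r_n} \right]^d \tag{39}$$

with $r_1 = (6/7)^d$.

Going through the same procedure as before, we find that the second moment diverges for
d < 7.16. In comparison, the first moment diverges for d < 5.19.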



V. RECURSION RELATIONS FOR ARBITRARY K 



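Similarly, we can easily write the recursions for $F_R(n)$ for any K. Skipping details, the equations
for the $r_n$'s and for $\langle F^2(n) \rangle$ for arbitrary K are:

$$r_{n+1} = \left[ \frac{(2^{K-1} - 1) + g(r_n)}{(2^{K-1} - 0.5) + g(r_n)} \right]^d \tag{40}$$

where

$$g(r_n) = (2^{K-1} - 1) \sum_{j=0}^{K-2} \binom{K-1}{j} r_n^{K-1-j} \tag{41}$$

and

$$\langle F^2(n) \rangle = \langle F^2(n-1) \rangle^{(K-1)d} \left( (2^{K-1} - 1/2) + g(r_{n-1}) \right)^d \tag{42}$$

These expressions can be simplified a bit. We can write

$$r_{n+1} = \left[ \frac{(1 + r_n)^{K-1}}{\beta_K - 1 + (1 + r_n)^{K-1}} \right]^d \tag{43}$$

defining $\beta_K = (2^{K-1} - 0.5)/(2^{K-1} - 1)$,
and

$$\langle F^2(n) \rangle = \langle F^2(n-1) \rangle^{(K-1)d} \, (2^{K-1} - 1)^d \left( \beta_K - 1 + (1 + r_{n-1})^{K-1} \right)^d \tag{44}$$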
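
One way to solve these recursions numerically is sketched below. It is not necessarily the procedure used to produce the numbers in the text. Taking logarithms of the recursion for $\langle F^2(n) \rangle$ and dividing by $((K-1)d)^n$ gives a convergent series for the growth exponent. The sketch assumes boundary values $\langle F^2(0) \rangle = 1/2$ and $r_0 = 0$, and the names `growth_exponent` and `second_moment_threshold` are illustrative.

```python
import math


def growth_exponent(K, d, terms=80):
    """Limit of ln<F^2(n)> / ((K-1)*d)**n as n grows, from the exact recursion.

    Uses <F^2(n)> = <F^2(n-1)>**((K-1)d) * C(r_{n-1})**d with
    C(r) = (2**(K-1) - 1) * (1 + r)**(K-1) + 1/2 and
    r_{n+1} = ((C(r_n) - 1/2) / C(r_n))**d.
    """
    m = (K - 1) * d
    r = 0.0                      # assumed boundary ratio <FG>/<F^2> at depth 0
    total = -math.log(2.0)       # ln <F^2(0)> with <F^2(0)> = 1/2
    weight = 1.0 / (K - 1)       # equals d / m**(k+1) at step k = 0
    for _ in range(terms):
        c = (2 ** (K - 1) - 1) * (1.0 + r) ** (K - 1) + 0.5
        total += weight * math.log(c)
        weight /= m
        r = ((c - 0.5) / c) ** d
    return total


def second_moment_threshold(K, tol=1e-6):
    """Bisection for the d at which the growth exponent changes sign."""
    lo, hi = 1.5, 2.0 ** (K + 1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if growth_exponent(K, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for K in (2, 3, 4):
        print(K, round(second_moment_threshold(K), 2))
```

The bisection bracket $[1.5, 2^{K+1}]$ is a convenience choice for this sketch. The resulting thresholds can be compared with the second-moment values quoted for K = 2 and K = 3.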



Solving these numerically for different K, we get the numbers in the third column of Table I.

TABLE I: The critical values of d for the first and second moments on the rooted tree, determined by
numerically solving the recursion relations derived in the text. The fourth column contains an estimate of
the critical value from the expression for the second moment in Eq. (46).

K    First moment    Second moment    Second moment, Eq. (46)
2    2.41            3.06             2.9
3    5.19            7.16             5.89
4    10.74           15.24            11.6
5    21.832          31.34            22.8
6    44.014          63.52            45.
7    88.376          127.86           89.5
8    177.099         256.51           177.6

That different moments pick out different critical values of d is not confined to the tree. We
argue below that K-SAT on a random graph behaves similarly.

As mentioned earlier, the behaviour of the first moment on the random graph is identical to that
on a tree from simple considerations. The second moment, while not calculable in closed form for
a random graph, can be written in terms of the fraction of realizations for which a pair of assignments
are both simultaneously solutions. The probability of two assignments simultaneously being
a solution for a given realization depends on the overlap between the two assignments. If two
assignments have an overlap pN (0 < p < 1), the probability that both are solutions is
$f(p)^M = (1 - 2^{1-K} + 2^{-K} p^K)^M$, where M is the total number of clauses. The exact expression for the
second moment is thus simply the sum, over all overlaps, of the number of pairs of assignments with
overlap pN times the probability that such a pair are both simultaneously solutions for a realization [13]:


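$$\langle Z^2 \rangle = 2^N \sum_{z=0}^{N} \binom{N}{z} f(p = z/N)^{M} \tag{45}$$

Following Achlioptas et al. and using the leading-order approximation
$\binom{N}{z} = \left( p^p (1-p)^{1-p} \right)^{-N} \times \mathrm{poly}(N)$, the term which contributes the maximum
to the above sum is

$$\left[ 2 \max_{p} \frac{f(p)^{\alpha}}{p^p (1-p)^{1-p}} \right]^{N} \times \mathrm{poly}(N) \tag{46}$$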

For our purposes it is easy to see, by plotting the term which is exponentiated in N as a function
of p, that there is a value of $\alpha$ above which the maximum value (which is an estimate of the second
moment) is less than 1. The numbers at which this happens for different K are reported in the
fourth column of Table I. We have also estimated the critical value using simulations. For example,
Fig. 3 plots the second moment as a function of N for various values of $\alpha$ for random 3-SAT. We
find that the value of $\alpha$ for which the second moment starts decaying with increasing N is $5.7 \pm 0.1$.

FIG. 3: Second moment for random 3-SAT as a function of N for different values of $\alpha$ (curves for $\alpha$ = 5.4, 5.6, 5.7 and 5.8).
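
The overlap estimate can also be evaluated numerically instead of graphically. The sketch below assumes the overlap function $f(p) = 1 - 2^{1-K} + 2^{-K} p^K$ and looks for the value of $\alpha$ at which the logarithm of the exponentiated term, $\ln 2 + \alpha \ln f(p) - p \ln p - (1-p) \ln(1-p)$, maximised over p, changes sign. The grid search, the bisection bracket and the names `log_rate`, `max_log_rate` and `overlap_threshold` are choices made for this illustration.

```python
import math


def log_rate(K, alpha, p):
    """ln of 2 * f(p)**alpha / (p**p * (1 - p)**(1 - p))."""
    f = 1.0 - 2.0 ** (1 - K) + 2.0 ** (-K) * p ** K
    entropy = 0.0
    for x in (p, 1.0 - p):
        if x > 0.0:              # x * ln(x) -> 0 as x -> 0
            entropy -= x * math.log(x)
    return math.log(2.0) + alpha * math.log(f) + entropy


def max_log_rate(K, alpha, grid=2000):
    """Maximum of log_rate over a uniform grid of overlaps p in [0, 1]."""
    return max(log_rate(K, alpha, i / grid) for i in range(grid + 1))


def overlap_threshold(K, tol=1e-4):
    """Bisection for the alpha above which the maximum term decays with N."""
    lo, hi = 0.0, 2.0 ** (K + 1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if max_log_rate(K, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for K in (2, 3, 4):
        print(K, round(overlap_threshold(K), 2))
```

For K = 2 and K = 3 the sign change lies close to the fourth-column entries of Table I.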



Hence, at least for random 3-SAT, the value of the critical point for the second moment obtained 
from the overlap function is close to the numerical estimate. Simulations for higher moments near 
the critical point are harder to do since the fraction of successful realizations decays exponentially 
with N. For example, at $\alpha = 5.7$ and N = 70, only 208 out of the 10^ randomly chosen realizations
had solutions. However, we expect the higher moments to behave similarly, since
expressions for the third and fourth moments have also been evaluated for random 3-SAT.

The quantitative differences in the values of the critical point for the second moment between
our model and random K-SAT can be explained in terms of the boundary conditions of the tree
graph. For the tree graph, the probability that two assignments are simultaneously solutions needs
to be rewritten for the nodes near the boundary. If we make the simple assumption that all nodes
are nodes at depth 1, then we should replace $f(p)$ by $g(p) = (1 - 2^{1-K} + 2^{-K} p)$ in the expression
for the second moment in Eq. (46).

The critical values of d for the different moments indicate the heterogeneity of different
realizations. Even at very large values of d (or $\alpha$), there exist (an exponentially rare number of)
realizations which, by having an exponentially large number of solutions, contribute to the
corresponding moment. The critical values for different moments also provide bounds on the solvability
transition. For example, the critical value for the first moment gives a simple upper bound on the
solvability transition for any K. The ratio of the square of the first moment to the second moment
provides a lower bound on the probability that solutions exist, and is the starting point for the
weighted second moment method. However, to gain a better insight into the solvability
transition, it is more useful to look at how the fraction of solvable realizations changes as the number of
constraints increases. This quantity can again be calculated exactly on a rooted tree, and we carry
out this calculation in the next section.






VI. EXACT RECURSION RELATIONS ON THE TREE FOR THE PROBABILITY
THAT A VARIABLE CAN TAKE 2, 1 OR 0 VALUES FOR ARBITRARY K

We would like to estimate the probability that a realization has no solution. Such a realization
is one for which not a single assignment of the variables provides a solution. This can happen
if there is even a single variable on the graph which, whether it takes the value 0 or 1, causes
at least one clause to be unsatisfied. Such a variable then is a variable that can take 0 values by our
definition, and a realization that is not solvable has at least one variable of this type. On the tree
graph, we can define the probabilities of a variable taking 0, 1 or 2 values on the corresponding
subtree. We define $p_i(0)$ as the conditional probability for a variable $x_i$ to cause a contradiction in
the subtree of which it is the root, given that all the other variables in the subtree can take at least
1 value. We can then estimate the probability of a realization having a solution (or the fraction
of realizations that have solutions) by calculating the quantity $\prod_{x_i} (1 - p_i(0))$, where the product is
over all the variables in the graph. The tree structure of the graph gives us a way of calculating
$p_i(0)$ through recursions.

We define below some of the quantities in terms of which these recursions are written. We
define $P_n(0)$ to be the probability that (or the fraction of realizations in which) a variable at depth
n can neither take the value 0 nor take the value 1 without causing a contradiction in its subtree.
Note that because of the tree structure, all variables $x_i$ at depth n will have the same probability
$P_n$. The probability that a variable at depth n can take only one of the two values 0 or 1 is
defined to be $P_n(1)$. Similarly, the probability that a variable at depth n can take both values is
$P_n(2) = 1 - P_n(0) - P_n(1)$. As before, in all that follows, we consider the boundary variables to
have depth 0. In the sections below, we describe how to set up the recursions for 2-SAT and then
for general K-SAT.

We would like to calculate $P_{n+1}(0)$ and $P_{n+1}(1)$ for variable $x_0$ (assuming it is at depth n + 1),
given these quantities for its descendants. Let us consider $P_{n+1}(0)$ first for the 2-SAT problem.
Assume variable $x_0$ has degree d (by definition) and assume it is not negated on $d_1$ of these links.
Variable $x_0$ will not be able to take the value 0 in the case when at least one of the $d_1$ links is not
satisfied by the variable at the other end. In this case there will be at least one unsatisfied clause
if variable $x_0$ takes the value 0. Similarly, at least one of the $d - d_1$ links along which variable
$x_0$ is satisfying its link should also not be satisfied by the variable at the other end. This latter
condition implies that variable $x_0$ cannot take the value 1 either. It is easy to see that averaging
over all realizations at depth n + 1 implies averaging over all values of $d_1$ as before, and averaging
over all realizations at depth n. It is important to note however that all the realizations at depth
n + 1 are only built up from those realizations at depth n that do have solutions.
Putting all this together, the recursion relations for 2-SAT are:

$$P_{n+1}(0) = 1 - 2 \left( 1 - \frac{1}{2} \frac{P_n(1)}{2(1 - P_n(0))} \right)^d + \left( 1 - \frac{P_n(1)}{2(1 - P_n(0))} \right)^d$$

$$P_{n+1}(1) = 2 \left( 1 - \frac{1}{2} \frac{P_n(1)}{2(1 - P_n(0))} \right)^d - 2 \left( 1 - \frac{P_n(1)}{2(1 - P_n(0))} \right)^d \tag{47}$$

The term $P_n(1)/(2(1 - P_n(0))) = Q_n(1)$ which appears in the above equations is just the conditional
probability that a depth n variable cannot satisfy its link above (to depth n + 1), given that it
has to be able to take at least one value (which satisfies the sub-tree of which it is the root). The
only way this can happen is if the variable can take only one value and that value does not satisfy
the link above (this happens with probability 1/2 in our case). If the variable can take two values, we
can always find one value which satisfies the link above for any realization. We can iterate these
equations beginning with the boundary conditions

$$P_1(1) = 2(0.75)^d - 2(0.5)^d$$

$$P_1(0) = 1 + (0.5)^d - 2(0.75)^d \tag{48}$$

The boundary conditions are easily obtained by setting $P_0(1) = 1$ in Eqs. (47). We can hence
obtain $P_n(0)$ for any depth n. The probability that a realization has a solution is then (as mentioned
earlier) $\prod_n (1 - P_n(0))^{g(n)}$, where g(n) is simply the number of variables at depth n. If $P_n(0) \neq 0$
then the above product decays exponentially to 0 with increasing tree size. We see from Eq. (47)
that it is when $P_n(1)$ takes a non-zero value that the fraction of realizations that have no solutions
also becomes non-zero.
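
The iteration just described is straightforward to carry out numerically. The sketch below assumes the recursion written in terms of $Q_n(1) = P_n(1)/(2(1 - P_n(0)))$, with $Q_n(1)$ raised to the power $K - 1$ for general K, the boundary values $P_0(0) = 0$ and $P_0(1) = 1$, and $g(n) = ((K-1)d)^{D-n}$ variables at depth n in a tree of total depth D. The names `step` and `solvable_fraction` are illustrative.

```python
import math


def step(p0, p1, d, K=2):
    """One application of the recursion: (P_n(0), P_n(1)) -> (P_{n+1}(0), P_{n+1}(1))."""
    q = p1 / (2.0 * (1.0 - p0))                 # Q_n(1)
    a = (1.0 - 0.5 * q ** (K - 1)) ** d
    b = (1.0 - q ** (K - 1)) ** d
    return 1.0 - 2.0 * a + b, 2.0 * a - 2.0 * b


def solvable_fraction(d, depth, K=2):
    """Product over depths n = 1..depth of (1 - P_n(0)) ** g(n)."""
    p0, p1 = 0.0, 1.0                           # boundary: surface variables take one value
    log_prob = 0.0
    for n in range(1, depth + 1):
        p0, p1 = step(p0, p1, d, K)
        count = ((K - 1) * d) ** (depth - n)    # number of variables at depth n
        log_prob += count * math.log(1.0 - p0)
    return math.exp(log_prob)


if __name__ == "__main__":
    print(step(0.0, 1.0, 3))
    print(solvable_fraction(3, 4))
```

For d = 3, K = 2 and depth 4 the product is of order $10^{-5}$, which illustrates how quickly solvable realizations become rare on the tree.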

For a tree graph, since the boundary vertices form most of the graph, we can just take a look
at Eq. (48). If we plot $P_1(0)$ as a function of d, we see immediately that this is nonzero for any
d > 1 and hence the fraction of realizations that do not have solutions is non-zero above d = 1.
This conclusion holds even if we calculate $\prod_n (1 - P_n(0))^{g(n)}$ using the recursions and calculating
g(n), the number of vertices at depth n, for each depth. Hence for d > 1 the probability of having
a realization with solutions goes down exponentially with N. For example, for d = 3, K = 2 and
n = 4, we find from exact enumerations that out of $10^7$ randomly generated instances, only 77
have solutions. Note however that these 77 suffice to make the second and higher moments still an
increasing function of N, as we saw in the previous section.

The recursion relations for 2-SAT are easily generalised to arbitrary K:

$$P_{n+1}(0) = 1 - 2 \left( 1 - 0.5 \, (Q_n(1))^{K-1} \right)^d + \left( 1 - (Q_n(1))^{K-1} \right)^d$$

$$P_{n+1}(1) = 2 \left( 1 - 0.5 \, (Q_n(1))^{K-1} \right)^d - 2 \left( 1 - (Q_n(1))^{K-1} \right)^d \tag{49}$$

with the boundary conditions obtained by putting $P_0(0) = 0$ and $P_0(1) = 1$ as before.

We find that for all K, for d > 1, the fraction of realizations that have solutions decays
exponentially with N. Since this is also usually how the SAT-UNSAT transition is defined numerically,
we can conclude that $\prod_n (1 - P_n(0))^{g(n)}$ is the order parameter for the solvability transition in
our model. The transition occurs at d = 1 for all K when the boundaries are fixed randomly.
In comparison, for random 3-SAT the value of $\alpha$ (that we find from simulations) at which the
fraction of successful realizations starts decaying exponentially is $\alpha = 4.25 \pm 0.05$. Hence both
the behaviour of the moments and the SAT-UNSAT transition are qualitatively similar
to random K-SAT, though quantitatively, on the tree graph, these are influenced mostly by the
surface variables, as expected. Hence in the next section, we look at the behaviour of Eqs. (47)
and (49) in the interior of the tree, for an infinite rooted tree.

VII. FIXED POINT ANALYSIS 

In this section we will try to get rid of boundary effects by looking deep within the tree. This
is usually equivalent to doing a calculation on a Bethe lattice, in which case all vertices on the
tree should be equivalent and have the same coordination number. On the rooted tree we have
discussed so far, the root has a coordination number 1 less than the other sites. However, from
the structure of the recursions, Eq. (19), for any vertex/variable on the tree, it is only the $d$
connections (to descendants) which determine the probability $P_n(0)$. Hence $P_n(0)$ at a depth $n$ is
not changed when more levels are added to the tree. The same is not true for $P_n(1)$ which, at least
when $P_n(0) \neq 0$, needs to be corrected for interior sites. However this correction does not affect
the transition that we discuss below. We can hence hope that the fixed point of the recursions
Eqs. (17) and (39) will give us an insight into how these probabilities behave in the interior of the
tree, independent of boundary conditions.

Consider the recursions (Eq. (47)) for 2-SAT first. Noticing that it is the quantity $Q_n(1)$ which
appears on the RHS of these equations, we can rewrite the recursions as

$$
Q_{n+1}(1) = \frac{[1 - 0.5\,Q_n(1)]^d - [1 - Q_n(1)]^d}{2[1 - 0.5\,Q_n(1)]^d - [1 - Q_n(1)]^d} \qquad (50)
$$

If the above map has a fixed point, then $Q_n(1) = Q_{n+1}(1) = Q^*$ and $Q^* = f(Q^*)$,
where $f(Q^*)$ is the function on the RHS and $0 \le Q^* \le 1$.

FIG. 4: Fixed points for d < 2 and d > 2 for 2-SAT

We can look for this graphically as shown in Fig. 4. For $d < 2$, there is only one solution to the
equation $Q^* = f(Q^*)$ and this lies at $Q^* = 0$. For $d > 2$ there are two solutions, one at $Q^* = 0$
and the other at some nonzero $Q^*$. If $Q^* = 0$, clearly $P_n(0) = 0$ for large $n$. This implies that
(assuming all $N$ nodes in our system are deep in the tree) the probability that a realization has
a solution, $\prod (1 - P(0))$, is 1. Conversely, if $Q^* \neq 0$, this probability vanishes as $N \to \infty$.
Note that the nonzero value of $Q^*$ develops continuously from $Q^* = 0$. In other words, for $d$ only
very slightly larger than 2, the non-trivial solution is only very slightly larger than 0. In addition,
from the shape of the function $f(Q^*)$ we see that no matter what the boundary conditions, the
non-trivial fixed point is always reached for $d > 2$. Hence for 2-SAT, in the interior of an infinite
tree, a real transition (the solvability transition) occurs at $d = 2$.
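The graphical argument can be cross-checked by iterating the 2-SAT map directly. The Python sketch
below is illustrative only. The names `f` and `iterate_map` are hypothetical, and the starting value
`q0 = 0.5` and the iteration count are assumptions chosen for convenience. Expanding the map for small
$Q$ gives $f(Q) \approx dQ/2$, which is consistent with $Q^* = 0$ losing its stability at $d = 2$.

```python
def f(q, d):
    """Right-hand side of the 2-SAT map for Q_n(1)."""
    a = (1.0 - 0.5 * q) ** d
    b = (1.0 - q) ** d
    return (a - b) / (2.0 * a - b)


def iterate_map(d, q0=0.5, iters=5000):
    """Iterate Q -> f(Q) from the assumed starting value q0 and return the final value."""
    q = q0
    for _ in range(iters):
        q = f(q, d)
    return q


if __name__ == "__main__":
    # Below d = 2 the iteration collapses to Q* = 0; above it a nonzero Q* grows from 0.
    for d in (1.5, 1.9, 2.1, 2.5, 3.0):
        print(d, iterate_map(d))
```

For values of `d` only slightly above 2 the printed fixed point is small, in line with the continuous
onset described above.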

On general grounds [l^, we expect that the above results are comparable with those on a 
regular random graph with coordination number $\alpha = (d + 1)/2$ and hence a transition at $d = 2$ in
the interior of a tree should correspond to an $\alpha_c = 1.5$. It is interesting to note however that if we
use instead the relation $\alpha = d/2$ (as used in the correspondence between the cavity method and
tree-reconstruction) this gives the solvability transition for random 2-SAT to lie at $\alpha = 1$, which is
an exact and known result.

FIG. 5: Fixed points for d < 11.5 and d > 11.5 for 3-SAT

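Consider the case $K > 2$ now. As before, we can write

$$
Q_{n+1}(1) = \frac{\left[1 - 0.5\,Q_n(1)^{K-1}\right]^d - \left[1 - Q_n(1)^{K-1}\right]^d}{2\left[1 - 0.5\,Q_n(1)^{K-1}\right]^d - \left[1 - Q_n(1)^{K-1}\right]^d} \qquad (51)
$$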



We can investigate the fixed points graphically as shown for $K = 3$ in Fig. 5. For $d < 11.5$,
there is only one solution to the equation $Q^* = f(Q^*)$ and this lies at $Q^* = 0$ as before. For
$d > 11.5$ there are three solutions, one at $Q^* = 0$ and the other two at some nonzero $Q^*$.
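The value of $d$ at which nonzero fixed points first appear can be located numerically. The sketch
below is a rough illustration, not the procedure used in the paper. It assumes that iterating from
`q0 = 0.5` (the largest value the map can return) descends onto the largest fixed point, and it bisects
on whether the iterate stays above a small threshold. The names `f`, `has_nonzero_fixed_point` and
`critical_d`, as well as the bracket, tolerance and threshold, are hypothetical choices.

```python
def f(q, d, K):
    """Right-hand side of the general-K map for Q_n(1)."""
    x = q ** (K - 1)
    a = (1.0 - 0.5 * x) ** d
    b = (1.0 - x) ** d
    return (a - b) / (2.0 * a - b)


def has_nonzero_fixed_point(d, K, q0=0.5, iters=5000, eps=1e-6):
    """Iterate from q0 and report whether the iterate stays away from zero."""
    q = q0
    for _ in range(iters):
        q = f(q, d, K)
    return q > eps


def critical_d(K, lo=1.0, hi=2000.0, tol=1e-2):
    """Bisect for the smallest d at which a nonzero fixed point survives."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if has_nonzero_fixed_point(mid, K):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for K in (3, 4):
        print(K, critical_d(K))
```

With these settings the estimate for $K = 3$ falls between 11 and 12 and the one for $K = 4$ between 32
and 33. Close to the threshold the iteration slows down, so the iteration count limits the accuracy.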

Some important points of difference with 2-SAT are that the nonzero values of $Q^*$ develop
discontinuously from $Q^* = 0$. Also, from the shape of the function $f(Q^*)$, the first ($Q^* = 0$) and
third (non-trivial) values of $Q^*$ are stable while the middle value is unstable. This is true for all
$K \ge 3$. Hence from the shape of the function $f(Q^*)$ we also see that boundary conditions play an
important role for $K > 2$. The non-trivial solution is reached only if the boundary conditions are
such as to make the value $Q_0(1)$ larger than the value of the central fixed point. This points to a
first order transition for $K \ge 3$ as opposed to a continuous transition for $K = 2$. This difference
in the nature of the transitions at $K = 2$ and $K > 2$ is entirely analogous to the problem on a
random graph {igI . 
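The dependence on the boundary value can be seen by iterating the $K = 3$ map from different
starting points. This Python sketch is illustrative only. The value `d = 13` (chosen to lie above
11.5), the starting values, and the names `f` and `iterate_map` are assumptions made for the example.

```python
def f(q, d, K):
    """Right-hand side of the general-K map for Q_n(1)."""
    x = q ** (K - 1)
    a = (1.0 - 0.5 * x) ** d
    b = (1.0 - x) ** d
    return (a - b) / (2.0 * a - b)


def iterate_map(q0, d, K, iters=2000):
    """Iterate Q -> f(Q) starting from the boundary value q0."""
    q = q0
    for _ in range(iters):
        q = f(q, d, K)
    return q


if __name__ == "__main__":
    # Starting values below the middle (unstable) fixed point flow to 0,
    # while larger ones flow to the non-trivial stable fixed point.
    for q0 in (0.1, 0.2, 0.3, 0.5):
        print(q0, iterate_map(q0, d=13.0, K=3))
```

For these parameters the two smaller starting values end at zero and the two larger ones end at the
same nonzero value, which is the behaviour expected when an unstable fixed point separates two stable ones.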

For $K > 2$ the correspondence between the average degree of a variable on a regular random
graph and that of a variable on the tree comes out to be $\alpha = (d + 1)/K$.

TABLE II: The value of $d_c$ compared to $\alpha_d$ values obtained from [8].

  K    $d_c$    $(d_c + 1)/K$    $d_c/K$    $\alpha_d$ (from [8])
  2    2        1.5              1          1
  3    11.5     4.166            3.83       3.927
  4    32.6     8.4              8.15       8.297
  5    80       16.2             16.00      16.12
  6    182      30.5             30.33      30.5
  7    400      57.28            57.14      57.22
  8    856      107.13           107.00     107.24

In Table II we report the value $d_c$ as obtained from the fixed point equations for our model, as
well as both $(d_c + 1)/K$ and $d_c/K$. It is interesting to note that these latter values are very close
to the value of $\alpha_d$ obtained in the literature earlier P, [13].

Note that in our case the value $d_c$ appears as that value of $d$ in the interior of an infinite tree
from which point onwards a solvability transition could take place, depending on the boundary
values. It is only for 2-SAT that the transition actually does coincide with $d$ taking the value $d_c$.
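The two conversions reported in Table II are simple arithmetic on $d_c$ and can be recomputed as
follows. This is a small illustrative sketch. The list `TABLE_II` copies the $K$, $d_c$ and $\alpha_d$
columns of the table, and the function name `alpha_estimates` is hypothetical.

```python
# Rows of Table II as (K, d_c, alpha_d); values copied from the table.
TABLE_II = [
    (2, 2.0, 1.0),
    (3, 11.5, 3.927),
    (4, 32.6, 8.297),
    (5, 80.0, 16.12),
    (6, 182.0, 30.5),
    (7, 400.0, 57.22),
    (8, 856.0, 107.24),
]


def alpha_estimates(K, d_c):
    """Return the two conversions used in the text: (d_c + 1)/K and d_c/K."""
    return (d_c + 1.0) / K, d_c / K


if __name__ == "__main__":
    for K, d_c, alpha_d in TABLE_II:
        upper, lower = alpha_estimates(K, d_c)
        print(K, round(upper, 3), round(lower, 3), alpha_d)
```

Printing both conversions next to $\alpha_d$ makes it easy to see, row by row, which of the two lies
closer to the literature value.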



VIII. SUMMARY AND DISCUSSION 



In this paper, we have studied the K-SAT problem on a rooted tree and have solved it exactly
for several quantities. A lot of progress in understanding the K-SAT problem has been made using
the cavity method 0, 0, [22], and a powerful heuristic, survey propagation (SP) [22, 23], has been
developed using these concepts. The SP equations, which are the basis for the algorithm, are a set
of coupled equations for the cavity bias surveys (messages sent from clauses to variables) in terms
of the probabilities of warnings received by the variable. The probability space over which these
are computed is the space built by all SAT assignments, each given equal probability, in a typical
satisfiable instance. It is conjectured that this solution space, for $\alpha > \alpha_d$, separates into many
distant clusters, and hence the SP message along an edge gives the probability that a warning is
sent in a randomly chosen cluster. Under this assumption, the SP equations lead to coupled integral
equations with a non-trivial fixed point, known as the 'one step RSB' solution. The same fixed
point equations may also be obtained by different means, by considering a reconstruction problem
on a tree Q]. The difference between an RS (replica-symmetric) solution and a 1RSB solution in the
tree case is due to the absence or presence of a re-weighting factor in the fixed point equation Q].
This re-weighting factor may be thought of as the term in Eqs. (1) and (2) (for example) which involves
the $\eta$'s. Were we to replace the $\eta$'s simply by 1/2 (their average value), then the recursion would
give the number of solutions at the next level of the tree as simply the product over the previous
level. The presence of the $\eta$'s makes the recursion more non-trivial. The recursions in Sections VI
and VII, too, keep this re-weighting factor.

Another difference in our work from the cavity method or SP is that the marginal probabilities 
are calculated level by level, instead of for each variable separately. This simplifies analysis, but 
makes our treatment valid strictly only for trees, unlike SP which is used to solve the SAT problem 
also on random graphs (though there are no guarantees of convergence in this case).

Our model shows many of the non-trivial features of the full random K-SAT problem. To
summarise, we find that different moments of the number of solutions start to decay at different
values of $d$. These values of $d$ are much larger than the value at which the solvability transition
occurs. Also the solvability transition in our model is continuous for $K = 2$ and discontinuous for
$K > 2$, as in random K-SAT. In addition the fixed point equations predict, for $K > 2$, a lower bound
on the solvability transition in the interior of the tree. These numbers, if converted to equivalent
$\alpha$ values on the random graph, are very close to the values predicted for $\alpha_d$ in the literature for
random graphs with an average node-degree $K\alpha$ (to make a more literal correspondence to our
tree, where every variable has the same degree, it would be interesting to compare with $\alpha_d$ values
predicted for regular random graphs). We should note though that the existence of $\alpha_d$ is motivated
in the literature by the structure of the space of solutions in random K-SAT, while in this work it
arises as the point at which the recursions can have more than one solution, depending on boundary
conditions. At this point, the fraction of realizations in which a variable is constrained to take the
same value in all SAT assignments becomes nonzero.

We can redo the computation of Eqns. HUas well as the fixed point analysis, for other variations 
of the SAT problem such as regular random K-SAT, introduced in [18] and for which bounds on
the threshold are derived in [19]. Preliminary results from the fixed point analysis show that
the fixed point equations behave similarly to those for K-SAT, but predict smaller values of $d_c$
for the same $K$ [20], as indeed is also the numerical prediction for the solvability transition for
this problem. It should also be possible to redo these calculations on a random tree (where
the degree of each vertex is Poisson-distributed), though we do not expect the results to change
qualitatively in this case. Another interesting generalization of these calculations is to compute a
more fine-grained quantity than $P_n(2)$, namely the probability that the root takes one
of the two values a certain fraction $\beta$ of the times. It would be interesting to see if this quantity
undergoes a transition in which $\beta$ changes from essentially taking the value 1/2 to having a
non-trivial distribution. Note that a similar calculation for random tree ensembles, with however an
average done over boundary conditions chosen uniformly over all satisfying assignments, is done in
[21].